THE RELATIVE IMPORTANCE OF INPUT ENCODING AND LEARNING METHODOLOGY ON PROTEIN SECONDARY STRUCTURE PREDICTION by ARNSHEA CLAYTON

Authors

  • Yanqing Zhang
  • Yi Pan
  • Rajeshekhar Sunderraman
Abstract

In this thesis the relative importance of input encoding and learning algorithm on protein secondary structure prediction is explored. A novel input encoding, based on multidimensional scaling applied to a recently published amino acid substitution matrix, is developed and shown to be superior to an arbitrary input encoding. Both decimal-valued and binary input encodings are compared. Two neural network learning algorithms, Resilient Propagation and Learning Vector Quantization, which have not previously been applied to the problem of protein secondary structure prediction, are examined. Input encoding is shown to have a greater impact on prediction accuracy than learning methodology, with a binary input encoding providing the highest training and test set prediction accuracy.

INDEX WORDS: Neural Networks, Protein Secondary Structure Prediction, Input Encoding, Resilient Propagation, Learning Vector Quantization
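As a rough illustration of what a binary input encoding can look like for this kind of task, the sketch below turns a window of residues into a flat feature vector. It is not taken from the thesis: the alphabet ordering, the zero padding at sequence ends, and the names `binary_table` and `encode_window` are assumptions made for the example. A decimal-valued encoding (for instance, coordinates obtained by multidimensional scaling) could be passed in as a different lookup table with the same interface.

```python
# Hypothetical sketch: binary (one-hot) window encoding of a protein sequence.
# Alphabet order and zero padding are illustrative assumptions, not the thesis's choices.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def binary_table(alphabet=AMINO_ACIDS):
    """Map each residue to a one-hot vector of length len(alphabet)."""
    return {
        aa: [1.0 if i == j else 0.0 for j in range(len(alphabet))]
        for i, aa in enumerate(alphabet)
    }


def encode_window(sequence, center, half_width, table):
    """Concatenate the vectors of all residues in a window around `center`.

    Positions that fall off either end of the sequence are padded with zeros.
    `table` may be any residue -> vector mapping (binary or decimal-valued).
    """
    dim = len(next(iter(table.values())))
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(sequence):
            features.extend(table[sequence[pos]])
        else:
            features.extend([0.0] * dim)
    return features


# Usage: a 3-residue window centred on the first residue of a toy sequence.
window = encode_window("MKVLA", center=0, half_width=1, table=binary_table())
```

Each window position contributes one block of the feature vector, so swapping the lookup table changes the input encoding without touching the learning algorithm that consumes it.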


Similar resources


Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it is converted to protein sequences by the transcription and translation processes. The protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...


Devising and experimenting correlation-based metrics for evaluating the effectiveness of input encoding techniques in prediction tasks

Motivations: Defining an optimal encoding for input data is fundamental to achieve high performances in prediction tasks. Its main responsibility is to transform input data to a format suitable for the classification algorithm. The selection of the best encoding is typically done by resorting to the knowledge of a human expert, entrusted with extracting the features that s/he deems useful for th...


TR 07-011 fRMSDPred: Predicting local rmsd between structural fragments using sequence information

The effectiveness of comparative modeling approaches for protein structure prediction can be substantially improved by incorporating predicted structural information in the initial sequence-structure alignment. Motivated by the approaches used to align protein structures, this paper focuses on developing machine learning approaches for estimating the RMSD value of a pair of protein fragments. T...


Identification of effectiveness assessment criteria for Seminary virtual courses

The main purpose of this research is to derive Seminary virtual education effectiveness criteria. Seminary education has main differences with tertiary or professional education. Therefore, to assess their effectiveness, we must take into account these differences. In this research we have used qualitative methodology and set a semi-interview mechanism with 15 experts in e-learning, all employe...



Publication date: 2006